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Abstract 

The joint modeling of longitudinal and time-to-event data is an active area of statistics re- 
search that has received a lot of attention in the recent years. More recently, a new and 
attractive application of this type of models has been to obtain individualized predictions of 
survival probabilities and/or of future longitudinal responses. The advantageous feature of 
these predictions is that they are dynamically updated as extra longitudinal responses are 
collected for the subjects of interest, providing real time risk assessment using all recorded 
information. The aim of this paper is two- fold. First, to highlight the importance of mod- 
eling the association structure between the longitudinal and event time responses that can 
greatly influence the derived predictions, and second, to illustrate how we can improve the 
accuracy of the derived predictions by suitably combining joint models with different asso- 
ciation structures. The second goal is achieved using Bayesian model averaging, which, in 
this setting, has the very intriguing feature that the model weights are not fixed but they are 
rather subject- and time-dependent, implying that at different follow-up times predictions 
for the same subject may be based on different models. 
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1 Introduction 



In recent years it has been recognized that personalized medicine forms the future of medical 
care. This has increased interest in development of prognostic models for many different 
types of diseases. Examples are numerous and include, prognostic models for various types 
of cancer, such as breast and prostate cancer; risk scores for cardiovascular diseases, such as 
the Framingham score; and prognostic models applied in AIDS research to assess the risk 
of HIV infected patients. However, even though there is a wealth of patient data available, 
the majority of prognostic models in the literature provide risk predictions using only a 
small portion of the recorded information. This is especially true for patient outcomes 
that are repeatedly measured in time where typically only the last measurement is utilized. 
A clear advantage of such simple models is that they can be applied in everyday clinical 
practice. However, an important limitation is that valuable information is discarded, which if 
appropriately used, could offer a better insight into the dynamics of the disease's progression. 
In particular, an inherent characteristic of many medical conditions, such as those described 
above, is their dynamic nature. That is, the rate of progression is not only different from 
patient to patient but also dynamically changes in time for the same patient. Hence, it is 
medically relevant to investigate whether repeated measurements of a biomarker can provide 
a better understanding of disease progression and a better prediction of the risk for the event 
of interest than a single biomarker measurement (e.g., at baseline or the last available). 
It is evident that markers with this capability would become a valuable tool in everyday 
medical practice, because they would provide physicians with a better understanding of 
disease progression for a particular patient, and allow them to make more informed decisions. 
This is also the aim of our motivating case study on patients who underwent aortic valve 
allograft implantation, described in detail in Section 2. In particular, even though human 
tissue valves have some advantages, they are also more susceptible to tissue degeneration 
and require re-intervention. Hence, cardiologists wish to use an accurate prognostic tool that 
will inform them about the future prospect of a patient with a human tissue valve in order 
to adjust medical care and postpone re-operation and/or death. 
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Motivated by the Aortic Valve case study, this paper aims to provide a flexible modeling 
framework for producing risk predictions that utilize all available patient information. We 
will explicitly model the longitudinal history of each subject by basing our developments on 
the framework of joint models for longitudinal and time-to-event data (Faucett and Thomas 
1996; Wulfsohn and Tsiatis 1997; Henderson et al. 2000; Tsiatis and Davidian 2004; Guo 
and Carlin 2004; Rizopoulos 2012). In particular, an attractive use of joint models is to 
derive dynamic predictions for either the survival or longitudinal outcomes (Yu et al. 2008; 
Proust-Lima and Taylor 2009; Rizopoulos 2011). The advantageous features of these pre- 
dictions are that they are individualized, are updated as extra longitudinal information is 
recorded for each subject, and are based on all past values of the longitudinal outcome. 
Our novel contributions are two-fold. First, the subjects under consideration often exhibit 
complex longitudinal trajectories with nonlinearities and plateaus. In these settings, it is 
relevant to consider which characteristics of a subject's trajectory best predict the event of 
interest. To this end, we will investigate how predictions are affected by assuming different 
types of association structure between the longitudinal and event time processes. We go 
beyond the standard formulation of joint models (Henderson et al. 2000), and we postulate 
functional forms that allow the rate of increase/decrease of the longitudinal outcome or a 
suitable summary of the whole longitudinal trajectory to determine the risk for an event. 
The consideration of competing association structures to describe the link between the two 
processes raises the issue of model uncertainty, which is typically ignored in prognostic mod- 
eling. Motivated by this, our second contribution is to derive predictions based not on a 
single model but on a collection of models simultaneously, combining them using Bayesian 
model averaging. As we will show later this approach bases predictions are based on the 
available data of a subject and the model weights are both individual- and time-dependent. 

The rest of the paper is organized as follows. Section 2 gives a background on aor- 
tic allograft implantation and describes the Aortic Valve dataset that motivates this re- 
search. Section 3 briefly introduces the joint modeling framework, presents estimation under 
a Bayesian approach and shows how dynamic individualized predictions can be derived under 



3 



a joint model. Section 4 introduces several formulations of the association structure between 
the longitudinal and survival processes. Section 5 presents the Bayesian model averaging 
methodology to combine predictions, and Section 6 illustrates the use of this technique in 
the Aortic Valve dataset. Section 7 refers to the results of a simulation study, and Section 8 
concludes the paper. 

2 Background on the Aortic Valve Dataset 

The motivation for this research comes from a study, conducted by the Department of 
Cardio-Thoracic Surgery of the Erasmus Medical Center in the Netherlands that includes 
286 patients who received a human tissue valve in the aortic position in the hospital from 
1987 until 2008 (Bekkers et al. 2011). Aortic allograft implantation has been widely used 
for a variety of aortic valve or aortic root diseases. Initial reports on the use of either 
fresh or cryopreserved allografts date from the early years of heart valve surgery. Major 
advantages ascribed to allografts are the excellent hemodynamic characteristics as a valve 
substitute; the low rate of thrombo-embolic complications, and, therefore, absence of the 
need for anticoagulant treatment; and the resistance to endocarditis. A major disadvantage 
of using human tissue valves, however is the susceptibility to (tissue) degeneration and need 
for re-interventions. The durability of a cryopreserved aortic allograft is age-dependent, 
leading to a high lifetime risk of re-operation, especially for young patients. Re-operations 
on the aortic root are complex, with substantial operative risks, and mortality rates in the 
range 4-12%. 

In our study, a total of 77 (26.9%) patients received a sub-coronary implantation (SI) and 
the remaining 209 patients a root replacement (RR). These patients were followed prospec- 
tively over time with annual telephone interviews and biennial standardized echocardio- 
graphic assessment of valve function until July 8, 2010. Echo examinations were scheduled 
at 6 months and 1 year postoperatively, and biennially thereafter, and at each examina- 
tion, echocardiographic measurements of aortic gradient (mmHg) were taken. By the end of 
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follow-up, 1241 aortic gradient measurements were recorded with an average of 5 measure- 
ments per patient (s.d. 2.3 measurements), 59 (20.6%) patients had died, and 73 (25.5%) 
patients required a re-operation on the allograft. Following the discussion in Section 1, our 
aim here is to use the existing data to construct a prognostic tool that will provide accurate 
risk predictions for future patients from the same population, utilizing their baseline informa- 
tion, namely age, gender and the type of operation they underwent, and their recorded aortic 
gradient levels. The composite event re-operation or death was observed for 125 (43.7%) 
patients, and the corresponding Kaplan-Meier estimator for the two intervention groups is 
shown in Figure 1. We can observe minimal differences in the re-operation-free survival rates 




Follow-up Time (years) 



Figure 1: Kaplan-Meier estimates of the survival functions for re-operation-free survival for 
the sub-coronary implantation (SI) and root replacement (RR) groups. 

between sub-coronary implantation and root replacement, with only a slight advantage of 
sub-coronary implantation towards the end of the follow-up. For the longitudinal outcome, 
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Figure 2 depicts the subject-specific longitudinal profiles for the two intervention groups. 
Because aortic gradient exhibits right skewness, we use the square root transform of aortic 
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Figure 2: Subject-specific profiles for the square root aortic gradient separately for the sub- 
coronary implantation (SI) and root replacement (RR) groups. 

gradient. We observe considerable variability in the shapes of these trajectories, but there 
are no systematic differences apparent between the two groups. 



3 Joint Model Specification and Predictions 
3.1 Submodels 

Let T* denote the true event time for the i-th subject (i = l,...,n), Cj the censoring 
time, and Tj = min(T*,Cj) the corresponding observed event time. In addition, we let 
S{ = I(T* < Ci) denote the event indicator, with /(•) being the indicator function that takes 
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the value 1 when T* < Ci, and otherwise. For the longitudinal process, we let y i denote the 
fjj x 1 longitudinal response vector for the i-th subject, with element yu denoting the value 
of the longitudinal outcome taken at time point I = 1, . . . , rij. Our aim is to postulate a 
suitable joint model that associates these two processes. 

Focusing on normally distributed longitudinal outcomes, we use a linear mixed-effects 
model to describe the subject-specific longitudinal trajectories. Namely, we have 

Vi(t) = m i (t)+e i (t)=xj(t)p + zj(t)b i + e i (t), 
bi ~ Af(0,D), Siit) ~A/"(0,a 2 ), 

where yi(t) denotes the value of the longitudinal outcome at any particular time point t, 
Xi(t) and Zi(t) denote the time-dependent design vectors for the fixed-effects (3 and for the 
random effects bj, respectively, and the corresponding error terms that are assumed 
independent of the random effects, and cov{£j(t), £»(£')} = for t' ^ t. For the survival 
process, we assume that the risk for an event depends on the true and unobserved value of 
the marker at time t, denoted by m,i{t) in (1). More specifically, we have 

hAt I Mi(t),Wi) = \imPv{t<T* <t + s\T* >t,Mi(t),Wi}/s 

= h (t) exp{j T Wi + arrii(t)}, t > 0, (2) 

where A4i(t) = {mj(s),0 < s < t} denotes the history of the true unobserved longitudinal 
process up to t, h (-) denotes the baseline hazard function, and u>i is a vector of baseline 
covariates with corresponding regression coefficients 7. Parameter a quantifies the associa- 
tion between the true value of the marker at t and the hazard for an event at the same time 
point. To complete the specification of the survival process we need to make appropriate 
assumptions for the baseline hazard function h Q (-). Typically in survival analysis this is left 
unspecified. However, since our aim is to make subject-specific predictions of survival proba- 
bilities, we use a smoother assumption for this function. We achieve this, while still allowing 
for flexibility in the specification of ho(-) by using a B-splines approach. In particular, the 
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log baseline hazard function is expressed as 

Q 

log/ioOO = lh (h o + ^2lh , q B q (t,v), (3) 

where B g (t,v) denotes the g-th basis function of a B-spline with knots Vi,. . . ,vq and j ho 
the vector of spline coefficients. Increasing the number of knots Q increases the flexibility in 
approximating log/io(-); however, we should balance bias and variance and avoid overfitting. 
A standard rule of thumb is to keep the total number of parameters, including the parameters 
in the linear predictor in (2) and in the model for ho(-), between 1/10 and 1/20 of the total 
number of events in the sample (Harrell 2001, Section 4.4). After the number of knots has 
been decided, their location is based on percentiles of the observed event times Tj or of the 
true event times {Tj : T* < Ci,i = 1, . . . ,n} to allow for more flexibility in the region of 
greatest density. 

3.2 Estimation 

For the estimation of the joint model's parameters we follow the Bayesian paradigm and 
derive posterior inferences using a Markov chain Monte Carlo algorithm (MCMC). The like- 
lihood of the models is derived under the assumption that the vector of time-independent 
random effects 6j accounts for all interdependencies between the observed outcomes. That 
is, given the random effects, the longitudinal and event time process are assumed indepen- 
dent, and in addition, the longitudinal responses of each subject are assumed independent. 
Formally we have, 

piy^SilKO) = p(y i \b i ,O)p(T i ,8 i \b i ,0), (4) 

vivAKO) = Hp(ya\bi,e), (5) 

i 

where 6 T = (Oj , &y ,0 b ) denotes the full parameter vector, with d t denoting the parameters 
for the event time outcome, y the parameters for the longitudinal outcomes, and Of, the 
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unique parameters of the random-effects covariance matrix, and p(-) denotes an appropri- 
ate probability density function. In addition, we assume that given the observed history 
of longitudinal responses up to time s, the censoring mechanism and the visiting process 
are independent of the true event times and future longitudinal measurements at t > s. 
Under these assumptions, the likelihood contribution for the i-th subject conditional on the 
parameters and random effects takes the form: 



p{y i ,T h 8 i | 0,bi) = Y[p(yu I b i ;0 y )p(T i ,5 i | bi;0 t ,(3)p(bi;0 b 

1=1 

(a 2 )-"* 72 exp{- - xj t (3 - z^) 2 /2a 2 } 

i 

exp|^7 /l0i9J B 9 (T i , v) +j T w { + am i (T i )} 



x exp | - exp(7 1 wi) I exp j h0jq B q (s, v) + am^s) [> c/.s 

" i 



(6) 



x 



det(£>)- 1/2 exp(-&T D- 1 b i /2) 



where the intercept term 7^,0 from the definition of the baseline risk function (3) has been 
incorporated into the design vector tOj. The integral in the definition of the survival function 



Si(t I Mi(t), Wi) = exp j— J ho(s) exp{j T Wi + ami(s)}ds 



(7) 



does not have a closed-form solution, and thus a numerical method must be employed for 
its evaluation. Here we use a 15-point Gauss-Kronrod quadrature rule. For the parameters 
we take standard prior distributions. In particular, for the vector of fixed effects of the 
longitudinal submodel /3, for the regression parameters of the survival model 7, for the vector 
of spline coefficients for the baseline hazard 7 ho , and for the association parameter a we use 
independent univariate diffuse normal priors. For the variance of the error terms a 2 we take 
an inverse-Gamma prior, while for covariance matrices we assume an inverse Wishart prior. 
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3.3 Dynamic Individualized Predictions 

Under the Bayesian specification of the joint model, presented in Section 3, we can derive 
subject-specific predictions for either the survival or longitudinal outcomes (Yu et al. 2008; 
Rizopoulos 2011, 2012). To put it more formally, based on a joint model fitted in a sam- 
ple V n = {Tj, SijU^i = 1, . . . , n) from the target population, we are interested in deriving 
predictions for a new subject j from the same population that has provided a set of longitu- 
dinal measurements iVj(t) = {yj{s)', < s < £}, and has a vector of baseline covariates Wj. 
The fact that biomarker measurements have been recorded up to t, implies survival of this 
subject up to this time point, meaning that it is more relevant to focus on the conditional 
subject-specific predictions, given survival up to t. In particular, for any time u > t we are 
interested in the probability that this new subject j will survive at least up to u, i.e., 



Similarly, for the longitudinal outcome we are interested in the predicted longitudinal re- 
sponse at u, i.e., 



The time-dynamic nature of both 7tj(u \ t) and u>j(u \ t) is evident because when new 
information is recorded for patient j at time t' > t, we can update these predictions to 
obtain TCj(u \ t') and ooj(u | t'), and therefore proceed in a time dynamic manner. 

Under the joint modeling framework of Section 3, estimation of either 7Tj(u \ t) or ujj(u \ t) 
is based on the corresponding posterior predictive distributions, namely 



itjiu | t) = Pr(T; >u\T*> t^y^w^Vn). 



Ujiu I t) = E{ yj {u) | T* > t,y 3 (t),V n }. 




Pr(T* > u | t; > t,y j (t),6) P (6 I v n ) de 



for the survival outcome, and analogously 
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for the longitudinal one. The calculation of the first part of each integrand takes full ad- 
vantage of the conditional independence assumptions (4) and (5). In particular, we observe 
that the first factor of the integrand of TTj(u | t) can be rewritten by noting that: 



PrP7 > „ | t; > t ,y M , 9) = J^t; > . I t; > t, bMi | r; > t ,y M e 

Sj{u | Mj{u, bj), 6) 



dbj 



r sAu\M i (u,b i ),e\ , , 



whereas for | t) we similarly have: 



E{ yj (u) | T* >t,y j (t),6} = j E{y,{u) | b j ,6}p{b, \ T* > t, yj(t), 6) dbj 



x](u)/3 + z](u)b { .\ 



with 



if = J b jP (b j \T*>t,y j (t),o)db j . 



Combining these equations with the MCMC sample from the posterior distribution of the 
parameters for the original data we can devise a simple simulation scheme to obtain 
Monte Carlo estimates of TCj(u \ t) and Uj(u \ t). More details can be found in Yu et al. 
(2008) and Rizopoulos (2011, 2012). 



4 Functional Form 

In ordinary proportional hazards models it has been long recognized that the functional 
form of time-varying covariates influences the derived inferences; see, for example, Fisher 
and Lin (1999) and references therein. In the joint modeling framework however, where the 
longitudinal outcome plays the role of a time-dependent covariate for the survival process, 
this topic has received much less attention. The two main functional forms that have been 
primarily used so far in joint models include in the linear predictor of the relative risk 
model (2) either the subject-specific means rriiit) from the longitudinal submodel or just the 
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random effects 6j (Henderson et al. 2000; Rizopoulos and Ghosh 2011). Nonetheless, in our 
setting, where our primary interest is in producing accurate predictions, we expect that the 
link between the longitudinal and event time processes to be important, and therefore it is 
relevant to investigate if and how the accuracy of predictions is influenced by the assumed 
functional form. This is motivated by the fact that there could be other characteristics of 
the patients' longitudinal profiles that are more strongly predictive for the risk of an event, 
such as the rate of increase/decrease of the biomarker's levels or a suitable summary of the 
whole longitudinal trajectory. 

In this section, we present a few examples of alternative formulations for the associa- 
tion structure between the longitudinal outcome and the risk for an event. These different 
parameterizations can be seen as special cases of the following general formulation of the 
relative risk model: 

Kit) = h (t) exp[<y T Wi + a T f{t,b i7 Mi(t)}], 

where /(•) is a time-dependent function that may depend on the random effects and on the 
true longitudinal history, as approximated by the mixed-effects model, and ex. is a vector of 
association parameters. 

4.1 Time-Dependent Slopes 

The standard formulation (2) postulates that the risk for an event at time t is associated 
with parameter a measuring the strength of this association. Even though this is a very 
intuitively appealing parameterization with a clear interpretation for a, it cannot distinguish 
between patients who, for instance, at a specific time point have equal true marker levels, 
but they may differ in the rate of change of the marker, with one patient having an increasing 
trajectory and the other a decreasing one. An extension of (2) to capture such a setting has 
been considered by Ye et al. (2008) who posited a joint model in which the risk depends on 
both the current true value of the trajectory and the slope of the true trajectory at time t. 
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More specifically, the relative risk survival submodel takes the form, 



hi(t) = ho(t) exp{j T Wi + aiTOi(t) + a 2 m[(t)} 



(8) 



where m'^t) = d{xj (t)(3+zj (t)bi}/dt. The interpretation of parameter a\ remains the same 
as in the standard parameterization (2). Parameter a 2 measures the association between 
the value of the slope of the true longitudinal trajectory at time t and the risk for an event 
at the same time point, provided that rrii(t) remains constant. 

4.2 Cumulative Effects 

A common characteristic of both (2) and (8) is that the risk for an event at any time t 
is assumed to be associated with features of the longitudinal trajectory at the same time 
point. However, in ordinary time-dependent Cox models several authors have argued that 
this assumption is over-simplistic, and in many real settings we may benefit from allowing 
the risk to depend on a more elaborate function of the history of the time-varying covariate 
(Sylvestre and Abrahamowicz 2009). Extending this concept in the context of joint models, 
we will account for the cumulative effect of the longitudinal outcome by including in the 
linear predictor of the relative risk submodel the integral of the longitudinal trajectory from 
baseline up to time t. More specifically, the survival submodel takes the form 



where for any particular time point t, a measures the strength of the association between 
the risk for an event at time point t and the area under the longitudinal trajectory up to the 
same time t, with the area under the longitudinal trajectory taken as a suitable summary of 
the whole marker history Aii(t) = {rrii(s), < s < £}. 

A feature of (9) is that it assigns the same weight on all past values of the longitudinal 
trajectory. This may not be reasonable when biomarker values closer to t are considered 
more relevant. An extension that allows placing different weights at different time points 
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is to multiply rrii(t) with an appropriately chosen weight function zv(-) that places different 
weights at different time points, i.e., 



A possible family of functions with this property are probability density functions of known 
parametric distributions, such as the normal, the Student 's-t and the logistic. The scale 
parameter in these densities and the degrees of freedom parameter in the Student 's-i density 
can be used to tune the relative weights of more recent marker values compared to older 
ones. 

4.3 Shared Random Effects 

As mentioned earlier, one of the standard formulations of joint models includes in the linear 
predictor of the risk model only the random effects of the longitudinal submodel, that is, 



where a denotes a vector of association parameters each one measuring the association be- 
tween the corresponding random effect and the hazard for an event. This parameterization is 
more meaningful when a simple random-intercepts and random-slopes structure is assumed 
for the longitudinal submodel, in which case the random effects express subject-specific devi- 
ations from the average intercept and average slope. Under this setting this parameterization 
postulates that patients who have a lower/higher level for the longitudinal outcome at base- 
line (i.e., intercept) or who show a steeper increase/decrease in their longitudinal trajectories 
(i.e., slope) are more likely to experience the event. In that respect, this formulation shares 
also similarities with the time- dependent slopes formulation (8). 

A computational advantage of formulation (11) is that it is time- independent, and there- 
fore leads to a closed-form solution (under certain baseline risk functions) for the integral 
in the definition of the survival function (7). This facilitates computations since we do 




(10) 




(11) 
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not have to numerically approximate this integral. However, an important disadvantage 
of (11) emerges when polynomials or splines are used to capture nonlinear subject-specific 
evolutions, in which case the random effects do not have a straightforward interpretation, 
complicating in turn the interpretation of ex. Nonetheless, in our setting, we are primarily 
interested in predictions and not that much in interpretation, and thus we can consider (11) 
even under an elaborate mixed model. 

5 Bayesian Model Averaging 

The previous section demonstrated that there are several choices for the link between the 
longitudinal and event time outcomes. The common practice in prognostic modeling is to 
base predictions on a single model that has been selected based on an automatic algorithm, 
such as, backward, forward or stepwise selection, or on likelihood-based information criteria, 
such as, AIC, BIC, DIC and their variants. However, what is often neglected in this procedure 
is the issue of model uncertainty. For example, in the scenario that two models are correct, 
model selection forces us to choose one of the models even if we are not certain which model 
is true. In addition, with respect to predictions, there could be several competing models 
that could offer almost equally good predictions or even that some models may produce more 
accurate predictions for some subjects with specific longitudinal profiles, while other models 
may produce better predictions for subjects whose profiles have other features. Here we 
follow another approach and we explicitly take into account model uncertainty by combining 
predictions under different association structures using Bayesian model averaging (BMA) 
(Hoeting et al. 1999). We should stress that in our setting there is no concern in using 
BMA, because we do not average parameters that possibly have different interpretations 
under the different association structures, but rather predictions that maintain the same 
interpretation whatever the chosen functional form. 

Due to space limitations, we will only focus here on dynamic BMA predictions of survival 
probabilities. BMA predictions for the longitudinal outcome can be produced with similar 
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methodology. Following the definitions of Section 3.3, we assume that we have available data 
V n = {Tj, yf, i — 1, . . . , n) based on which we fit Mi, . . . , M K joint models with different 
association structures. Interest is in calculating predictions for a new subject j from the same 
population who has provided a set of longitudinal measurements 3^(t) = {yj(s); < s < t}, 
and has a vector of baseline covariates Wj. We let T>j(t) = {y,(t),T* > t,Wj} denote the 
available data for this subject. The model-averaged probability of subject j surviving time 
u > t, given her survival up to t is given by the expression: 

K 

Pr(r; > u | v&),v n ) = ^Pr(r; > u \ M k ,v 3 {t),v n )p{M k \ v 3 {t),v n ). (12) 

k=l 

The first term in the right-hand side of (12) denotes the model-specific survival probabilities, 
derived as in Section 3.3, and the second term denotes the posterior weights of each of the 
competing joint models. The unique characteristic of these weights is that they depend on 
the observed data of subject j, in contrast to classic applications of BMA where the model 
weights are the same for all subjects. This means that, in our case, the model weights 
are both subject- and time-dependent, and therefore, for different subjects, and even for 
the same subject but at different times points, different models may have higher posterior 
probabilities. Hence, this framework is capable of better tailoring predictions to each subject 
than standard prognostic models, because at any time point we base risk assessments on the 
models that are more probable to describe the association between the observed longitudinal 
trajectory of a subject and the risk for an event. 

For the calculation of the model weights we observe that these are written as: 

P (M k | VMV n ) = P(VM\M k )p(Vn\M k ) P (M k ) 
EPPAt) \M e )p(V n \M e )p(M e ) 

where 

p(Pj(t) | M k ) = J p(Pj(t) | k )p(O k | M k )dO k 
and p(T> n \ M k ) is defined analogously. The likelihood part p(V n \ 6 k ) is given by (6), 
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whereas p(Vj(t) \ Ok) equals 

P (Vj(t) | o k )=p(y j (t) | KOjSjit | buOjpibj | o k ). 

Thus, the subject-specific information in the model weights at time t comes from the available 
longitudinal measurements yj(t) but also from the fact that this subject has survived up 
to t. Closed-form expressions for the marginal densities p(T> n | M^) and p(T>j(t) \ M&) are 
obtained by means of Laplace approximations (Tierney and Kadane 1986) performed in two- 
steps, namely, first integrating out the random effects bi and then the parameters Ok- The 
details are given in the supplementary material. Finally, a priori we assume that all models 
are equally probable, i.e., p(M] t ) — 1/K, for all k — 1, . . . , K. 

6 Analysis of the Aortic Valve Dataset 

We return to the Aortic Valve dataset introduced in Section 2. Our aim here is to derive 
a prediction tool that will utilize all recorded information of a patient to provide accurate 
individualized predictions for both future aortic gradient levels and the risk of re-operation- 
free survival. Following the discussion of Sections 4 and 5, we will compare predictions 
under different association structures between the longitudinal and survival outcomes, with 
the BMA predictions that are based on all association structures simultaneously. 

We start by defining the set of joint models from which predictions will be calculated. 
For the longitudinal process we allow a flexible specification of the subject-specific square 
root aortic gradient trajectories using natural cubic splines of time. More specifically, the 
linear mixed model takes the form 

Vi {t) = ftSIi + /3 2 RRi + fa{B x {t t A) x SIJ + p A {B x {t, A) x RRj 
+ /3 5 {B 2 (t, A) x SI*} + P e {B 2 (t, A) x RRj} 
+ (3 7 {B 3 (t, A) x SIJ + (3 8 {B 3 (t, A) x RR,} 
+ b i0 + b il B 1 (t, X) + b i2 B 2 (t, X) + b i3 B 3 (t, X) + Ei(t), 
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where B n (t, A) denotes the B-spline basis for a natural cubic spline with boundary knots 
at baseline and 19 years and internal knots at 2.1 and 5.5 years (i.e., the 33.3% and 66.6% 
percentiles of observed follow-up times), SI and RR are the dummy variables for the sub- 
coronary implantation and root replacement groups, respectively, Ei(t) ~ A/"(0, a 2 ) and 6, ~ 
J\f(0,D). For the survival process we consider five relative risk models, each positing a 
different association structure between the two processes, namely: 



M 1 : 


h^t) 


= h (t) exp{7 1 RR i + 7 2 Age i + 7 3 Female; + a-im^t)}, 


M 2 : 


K{t) 


= h (t) exp{7 1 RR i + 7 2 Age- + 7 3 Female i + aiin^t) + a 2 m ■(£)}, 


M 3 : 


hi(t) 


= h (t) exp{7iRRi + 7 2 Age- + 7 3 Female; + a x I rrii(s)ds} } 

Jo 


M 4 : 


Kit) 


= /i (t) exp{7iRRj + 7 2 Age^ + 7 3 Femalei + «i / <j){t - s)mi(s)ds) , 

Jo 


M 5 : 


Kit) 


= h (t) exp(7iRR; + 7 2 Age^ + 7 3 Femalej + a^o + a 2 b a + a 3 b i2 + a 4 6 i3 ) , 



where the baseline hazard is approximated with splines, as described in Section 3.1, Female 
denotes the dummy variable for females, and </>(•) denotes the probability density function of 
the standard normal distribution. We fitted each of these joint models using a single chain 
of 115000 MCMC iterations from which we discarded the first 15000 samples as burn-in. All 
computations have been performed in R (version 2.15.2) and JAGS (version 3.3.0). Trace 
and auto-correlations plots did not show any alarming indications of convergence failure. 
Tables 1 and 2 in the supplementary material show estimates and the corresponding 95% 
credible intervals for the parameters in the longitudinal and survival submodels, respectively. 
We observe that the parameter estimates in the relative risk models show much greater 
variability between the posited association structures than the parameters in the linear 
mixed models. However, we should note that the interpretation of the regression coefficients 
7 is not the same in all five models because we condition on different components of the 
longitudinal process. 

We continue with the calculation of dynamic predictions based on these five joint models. 
To mimic the real-life use of a prognostic tool, we calculate predictions for three patients, 
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namely Patients 20, 22 and 81, who have been excluded from the dataset we used to fit 
the joint models. Patient 20 is a female of 64 years old and has received a sub-coronary 
implantation, Patient 22 is a male of 53 and has also received a sub-coronary implantation, 
and Patient 81 is a male of 39 and has undertaken a root replacement. The longitudinal 
trajectories of square root aortic gradient for these patients are illustrated in Figure 3, from 
which we see that Patients 20 and 22 show similar profiles with an initial drop in their 
aortic gradient levels after the operation and up to about five years followed by a stable 
increase for the next five years and a drop again. On the contrary, Patient 81 showed a 
steady increase of aortic gradient for the duration of his follow-up. For each one of those 
subjects we calculate dynamic predictions (i.e., a new prediction after each aortic gradient 
measurement has been recorded) for both the longitudinal and survival outcomes based on 
each of the five fitted joint models. In addition, following the methodology of Section 5, we 
also calculate BMA dynamic predictions for each patient by taking weighted averages of the 
predictions of the five joint models. We show the predictions of re-operation-free survival 
probabilities for Patients 20 and 81 in Figures 4 and 5, respectively. The predictions for 
Patient 22, and the predictions of future square root aortic gradient levels are presented in 
Figures 1-4 in the supplementary material. Similarly to the results in Tables 1 and 2 in 
the supplementary material, we observe that the predictions of survival probabilities seem to 
be more sensitive to the assumed association structure than the predictions of future square 
root aortic gradient levels. Much greater differences can be seen for Patient 81 who showed 
a steeper increase in his aortic gradient levels than the other two patients. Regarding the 
BMA predictions, it is interesting to note that they show some variability and they are not 
always dictated by a single joint model. In particular, Table 1 presents the time-dependent 
subject-specific BMA weights, from which some interesting observations can made. First, 
only models M 2 and M 3 contribute in the BMA predictions of these three patients with the 
weights of the other three models being practically zero. Second, we indeed observe that 
the choice of the most appropriate model changes in time; for example, for Patient 20 and 
using only the first measurement we observe that there is little to choose between models M 2 
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Figure 3: Longitudinal trajectories of square root aortic gradient for Patients 20, 22 and 81 
who have been excluded from the analysis dataset, and for who we calculate predictions. 



and M 3 , when the second measurement is observed then M 2 dominates, whereas when the 
third measurement is recorded prediction is based solely on model M3 (though predictions 
based on the first two measurements are very similar from all five models, more variability 
is observed for latter time points). Similar behavior is observed for the other two patients 
as well. This convincingly demonstrates that it would be not optimal to base predictions on 
a single model for all patients and during the whole follow-up. 
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Figure 4: Dynamic predictions of re-operation-free survival for Patient 20 under the five 
joint models along with the BMA predictions. Each panel shows the corresponding con- 
ditional survival probabilities calculated after each of her longitudinal measurements have 
been recorded. 

7 Simulations 

7.1 Design 

We performed a series of simulations to evaluate the finite sample performance of the BMA 
subject-specific predictions. The design of our simulation study is motivated by the set of 
joint models fitted to the Aortic Valve dataset in Section 6. In particular, we assume 300 
patients who have been followed-up for a period of 19 years, and were planned to provide 
longitudinal measurements at baseline and afterwards at nine random follow-up times. For 
the longitudinal process, and similarly to the model fitted in the Aortic Valve dataset, we 
used natural cubic splines of time with two internals knots placed at 2.1 and 5.5 years, and 
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Figure 5: Dynamic predictions of re-operation- free survival for Patient 81 under the five 
joint models along with the BMA predictions. Each panel shows the corresponding condi- 
tional survival probabilities calculated after each of his longitudinal measurements have been 
recorded. 

boundary knots placed at baseline and 19 years, i.e., the form of the model is as follows 



Vi {t) = ftTrtOj + ^Trtlj + 3 {Bi{t, A) x TrtO,} + ^{B^t, A) x TrtlJ 
+ fc{B 2 (t, A) x TrtO,} + fo{B 2 (t, A) x TrtlJ 
+ fr{B 3 (t, A) x TrtO,} + P 8 {B 3 (t, A) x TrtlJ 
+ b i0 + baB^t, A) + b i2 B 2 (t, A) + b i3 B 3 (t, A) + £i (t), 



where B n (t, A) denotes the B-spline basis for a natural cubic spline, TrtO and Trtl are the 
dummy variables for the two treatment groups, £j(t) ~ A/"(0, a 2 ) and bj ~ A/"(0, D) with D 
taken to be diagonal. 
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Subject Year (Aort.Grad.) 1/2 M 1 M 2 M 3 M 4 M t 



20 2.9 4.0 0.00 0.43 0.57 0.00 0.00 

3.9 3.6 0.00 1.00 0.00 0.00 0.00 

4.9 0.0 0.00 0.02 0.98 0.00 0.00 

7.0 2.6 0.00 0.00 1.00 0.00 0.00 
8.4 3.7 0.00 0.99 0.01 0.00 0.00 
10.3 4.0 0.00 0.00 1.00 0.00 0.00 
12.1 4.6 0.00 0.57 0.43 0.00 0.00 
14.2 3_ 1 6 0.00 0.88 0.12 0.00 0.00 

22 3.1 2.8 0.00 0.95 0.05 0.00 0.00 

4.2 2.4 0.00 1.00 0.00 0.00 0.00 

5.3 1.7 0.00 0.00 1.00 0.00 0.00 
8.3 2.2 0.00 1.00 0.00 0.00 0.00 
10.8 3.0 0.00 1.00 0.00 0.00 0.00 
13.1 4.2 0.00 0.01 0.99 0.00 0.00 
15.0 4.0 0.00 1.00 0.00 0.00 0.00 
17.3 2J3 0.00 0.67 0.33 0.00 0.00 

81 0X~ 3^2~ 0.00 1.00 0.00 0.00 0.00 

1.3 3.6 0.00 0.84 0.16 0.00 0.00 

3.3 5.0 0.00 1.00 0.00 0.00 0.00 

5.3 7.4 0.00 0.17 0.83 0.00 0.00 

7.1 9.8 0.00 0.19 0.81 0.00 0.00 
10.6 10.0 0.00 0.99 0.01 0.00 0.00 



Table 1: BMA posterior weights for the five joint models for each subject and after each 
measurement. 



For the survival process, we have assumed four scenarios, each one corresponding to 
a different functional form for the association structure between the two processes. More 
specifically, 

Scenario I: hi(t) = h (t) exp{7 + 7 1 Trtl i + c^m^i)}, 

Scenario II: hi(t) = h (t) exp{7 + 7 1 Trtl i + aiTOj(i) + a 2 m^(t)}, 

Scenario III: hi(t) = ho(t) exp{7 + 7iTrtlj + a± / TOj(s)ds}, 

Jo 

Scenario IV: hi(t) = h (t) exp(7 + 7iTrtl, + aib i0 + a 2 bn + a 3 b i2 + a^), 

with ho(t) = Ott Ct ~ x , i.e., the Weibull baseline hazard. The values for the regression coef- 
ficients in the longitudinal and survival submodels, the variance of the error terms of the 
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mixed model, the covariance matrix for the random effects, and the scale of the Weibull 
baseline risk function are given in Section 3 of the supplementary material. Censoring times 
were simulated from a uniform distribution in (0, t max ) with t max set to result in about 45% 
censoring in each scenario. For each scenario we simulated 200 datasets. 

7.2 Results 

Motivated by the Aortic Valve dataset to produce accurate risk predictions, but also from 
previous experience regarding the robustness of predictions for the longitudinal outcome on 
the assumed association structure, we have focused on dynamic predictions for the survival 
outcome. More specifically, under each scenario and for each simulated dataset, we randomly 
excluded ten subjects whose event times were censored (it is more meaningful to calculate 
predictions for those individuals since they have not experienced the event yet), and in the 
remaining patients, we fitted five joint models. The longitudinal submodel was the same 
as the one we simulated from. For the survival process we assumed relative risk submodels 
with: the current value term (as in Scenario I), the current value and current slope terms (as 
in Scenario II), the cumulative effect (as in Scenario III), the weighted cumulative effect (10) 
with weight function the probability density function of the standard normal distribution, 
and only including the random effects (as in Scenario IV). Because the aim of this simulation 
study was to investigate how the different association structures affect predictions, we also 
assumed the same Weibull baseline hazard function from which we simulated the data and 
not the B-spline approximated baseline hazard (3) presented in Section 3.1. 

Based on the five fitted joint models, we calculated predictions for each of the ten subjects 
we have originally excluded, at ten equidistant time points between their last available 
longitudinal measurement and the end of follow-up. These predictions were calculated under 
the true model for each scenario, BMA including all five models, and BMA based on four 
models without including the true one for the specific scenario. These predicted survival 
probabilities were then compared with the gold standard survival probabilities, calculated 
as Sj{u | M.j(u, bj), &j/Sj{t \ Aij(t,bj),0}, using the true parameter values and the true 
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values of the random effects for the subjects we excluded. In each simulated dataset and for 
each of the ten subjects, we calculated root mean squared prediction errors (RMSEs) between 
the gold standard survival probabilities and the predictions under the three methods. The 
RMSEs over all the subjects from the 200 datasets are shown in Figure 6. For all scenarios 
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Figure 6: Simulation results under the four scenarios based on 200 datasets. Each box- 
plot shows the distribution of the root mean squared predictions error of the corresponding 
method to compute predictions (i.e., based on the true model, BMA including the true model 
and BMA without the true model) versus the gold standard. 

we observe that in practice the BMA subject-specific predictions perform very well against 
the corresponding predictions from the true model. The greatest differences are observed in 
Scenario IV in which the BMA predictions seem to considerably outperform the true model 
predictions. A more careful examination of this scenario showed that this behavior was due 
to the fact that all models produced on average accurate predictions, but the predictions 
from the true model were much more variable than ones from the other models. In addition, 
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for the scenarios we considered in this study, it seems that BMA works equally well whether 
or not the true model is included in the list of the models that are averaged. This is a 
promising result because it suggests that BMA-based predictions could 'protect' against a 
misspecification of the association structure between the two processes. 

8 Discussion 

This paper illustrated how dynamic predictions from joint models with different association 
structures can be optimally combined using Bayesian model averaging. The novel feature of 
this approach is that the weights for combing predictions depend on the recorded information 
for the subject for whom predictions are of interest. Thus, for different subjects and even for 
the same subject but at different follow-up times, different models may have higher weights. 
This explicitly accounts for model uncertainty and acknowledges that a single prognostic 
model may not be adequate for quantifying the risk of all patients. Our simulation study 
showed that BMA predictions perform very well in comparison with predictions from the 
true model, even if the true model is not included in the list of models that are averaged. 
This gives us more confidence in trusting BMA for deriving predictions for future patients 
from the Aortic Valve study population based on the five joint models we have considered. 

In our developments we have considered a simple joint model for a single longitudinal 
outcome and one time-to-event. However, often in longitudinal studies several outcomes 
are recorded on each patient during follow-up. For example, for the patients enrolled in 
the Aortic Valve study also aortic regurgitation was recorded, which is another measure of 
valve function. Hence it would be of interest to investigate whether by considering both 
longitudinal biomarkers we could improve the accuracy of predictions. In addition, in our 
analysis we have considered the composite event re-operation or death (whatever comes 
first), but for the treating physicians it could be of interest to have separate risk estimates 
for the two events. Based on recent advances in joint modeling that include joint models for 
multiple markers and multiple event times (Huang et al. 2011; Rizopoulos and Ghosh 2011; 
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Liu and Huang 2009), our ideas could be relatively easily extended to these more elaborate 
cases. The challenge in such settings will be the high dimensionality of the model space. 
This is because the number of possible combinations of association structures between the 
longitudinal and the survival processes grows exponentially with the number of outcomes. 

A topic that we have partially addressed in this paper, but which is of high relevance for 
the practicality of prognostic models concerns the external validation of the derived predic- 
tions in terms of both discrimination and calibration. In particular, from our simulations 
we saw that the BMA predictions perform satisfactorily compared to the predictions using 
the true model, but this does not answer the question of how accurately the longitudinal 
outcome can predict the survival one or if it can discriminate between subjects of low and 
high risk. In standard survival analysis there has been a lot of research at these two fronts 
(see e.g., Zheng and Heagerty 2007; Gerds and Schumacher 2006, and references therein), but 
within the joint modeling framework relatively little work has been done (Rizopoulos 2011; 
Proust-Lima and Taylor 2009). Theoretically, all standard measures for calibration (e.g., 
Brier score) and discrimination (time-dependent sensitivity, specificity and ROC curves) can 
be defined for joint models, but their estimation is more challenging. 

9 Supplementary Material 

Supplementary material are available in SuppPredParam.pdf, and include Section 1: Fig- 
ures and Tables with results from the analysis of the Aortic Valve dataset; Section 2: 
Details on the Laplace approximations to calculate p(T> n \ Mk) and p(T>j(t) | Mk); Section 
3: Details on the simulation settings. 
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